NSF PAR Search | NSF Public Access Repository


Reddit Rules and Rulers: Quantifying the Link Between Rules and Perceptions of Governance Across Thousands of Communities

https://doi.org/10.1609/icwsm.v19i1.35863

Leibmann, Leon; Weld, Galen; Zhang, Amy X; Althoff, Tim (June 2025, Proceedings of the International AAAI Conference on Web and Social Media)

Rules are a critical component of the functioning of nearly every online community, yet it is challenging for community moderators to make data-driven decisions about what rules to set for their communities. The connection between a community's rules and how its membership feels about its governance is not well understood. In this work, we conduct the largest-to-date analysis of rules on Reddit, collecting a set of 67,545 unique rules across 5,225 communities which collectively account for more than 67% of all content on Reddit. More than just a point-in-time study, our work measures how communities change their rules over a 5+ year period. We develop a method to classify these rules using a taxonomy of 17 key attributes extended from previous work. We assess what types of rules are most prevalent, how rules are phrased, and how they vary across communities of different types. Using a dataset of communities' discussions about their governance, we are the first to identify the rules most strongly associated with positive community perceptions of governance: rules addressing who participates, how content is formatted and tagged, and rules about commercial activities. We conduct a longitudinal study to quantify the impact of adding new rules to communities, finding that after a rule is added, community perceptions of governance immediately improve, yet this effect diminishes after six months. Our results have important implications for platforms, moderators, and researchers. We make our classification model and rules datasets public to support future research on this topic.
Full Text Available
Making Online Communities ‘Better’: A Taxonomy of Community Values on Reddit

https://doi.org/10.1609/icwsm.v18i1.31413

Weld, Galen; Zhang, Amy X; Althoff, Tim (May 2024, Proceedings of the International AAAI Conference on Web and Social Media)

Many researchers studying online communities seek to make them better. However, beyond a small set of widely-held values, such as combating misinformation and abuse, determining what ‘better’ means can be challenging, as community members may disagree, values may be in conflict, and different communities may have differing preferences as a whole. In this work, we present the first study that elicits values directly from members across a diverse set of communities. We survey 212 members of 627 unique subreddits and ask them to describe their values for their communities in their own words. Through iterative categorization of 1,481 responses, we develop and validate a comprehensive taxonomy of community values, consisting of 29 subcategories within nine top-level categories, enabling principled, quantitative study of community values by researchers. Using our taxonomy, we reframe existing research problems, such as managing influxes of new members, as tensions between different values, and we identify understudied values, such as those regarding content quality and community size. We call for greater attention to vulnerable community members' values, and we make our codebook public for use in future research.
Full Text Available
IMBUE: improving interpersonal effectiveness through simulation and just-in-time feedback with human-language model interaction

Lin, Inna; Sharma, Ashish; Rytting, Christopher; Miner, Adam; Suh, Jina; Althoff, Tim (August 2024, ACL)

Full Text Available
"I will just have to keep driving": A Mixed-methods Investigation of Lack of Agency within the Thai Motorcycle Rideshare Driver Community

https://doi.org/10.1145/3653706

Tieanklin, Nussara; Breda, Joseph; Althoff, Tim; Heimerl, Kurtis (April 2024, Proceedings of the ACM on Human-Computer Interaction)

This paper presents a mixed-methods study of app-based motorcycle taxis in Thailand to explore the social dynamics of rideshare drivers and their exercised autonomy both through social pressure and a hostile work environment. As motorcycle taxis are open-air vehicles, drivers can be exposed to prolonged air pollution and other weather events, potentially impacting their health. In an initial quantitative study of server-side rideshare logs, we unexpectedly found that drivers do not exercise the autonomy provided by their rideshare platform to avoid air pollution events. This prompted a follow-on investigation through semi-structured interviews of both drivers and passengers in three provinces to explore why these drivers fail to experience the autonomy promised by gig-work in this context and elucidated further examples of this lack of autonomy experienced by drivers. Our study sheds light on the social context that may constrain a driver's agency, including financial pressures, weather conditions, conflicts with local taxi organizations, and a false perception that drivers need to work around the ride assignment algorithm to avoid being blacklisted. We find that when leveraging app-based rideshare opportunities, drivers simultaneously perceive increased flexibility in their work hours and a lack of agency to prioritize their health and safety. We conclude with a discussion on potential interventions aimed at mitigating the forces preventing drivers from exercising their autonomy.
Full Text Available
How Do Analysts Understand and Verify AI-Assisted Data Analyses?

https://doi.org/10.1145/3613904.3642497

Gu, Ken; Shang, Ruoxi; Althoff, Tim; Wang, Chenglong; Drucker, Steven M (May 2024, ACM)

Full Text Available
How Do Data Analysts Respond to AI Assistance? A Wizard-of-Oz Study

https://doi.org/10.1145/3613904.3641891

Gu, Ken; Grunde-McLaughlin, Madeleine; McNutt, Andrew; Heer, Jeffrey; Althoff, Tim (May 2024, ACM)

Full Text Available
Natural language processing for mental health interventions: a systematic review and research framework

https://doi.org/10.1038/s41398-023-02592-2

Malgaroli, Matteo; Hull, Thomas D; Zech, James M; Althoff, Tim (December 2023, Translational Psychiatry)

Neuropsychiatric disorders pose a high societal cost, but their treatment is hindered by lack of objective outcomes and fidelity metrics. AI technologies and specifically Natural Language Processing (NLP) have emerged as tools to study mental health interventions (MHI) at the level of their constituent conversations. However, NLP’s potential to address clinical and research challenges remains unclear. We therefore conducted a pre-registered systematic review of NLP-MHI studies using PRISMA guidelines (osf.io/s52jh) to evaluate their models, clinical applications, and to identify biases and gaps. Candidate studies (n = 19,756), including peer-reviewed AI conference manuscripts, were collected up to January 2023 through PubMed, PsycINFO, Scopus, Google Scholar, and ArXiv. A total of 102 articles were included to investigate their computational characteristics (NLP algorithms, audio features, machine learning pipelines, outcome metrics), clinical characteristics (clinical ground truths, study samples, clinical focus), and limitations. Results indicate a rapid growth of NLP MHI studies since 2019, characterized by increased sample sizes and use of large language models. Digital health platforms were the largest providers of MHI data. Ground truth for supervised learning models was based on clinician ratings (n = 31), patient self-report (n = 29) and annotations by raters (n = 26). Text-based features contributed more to model accuracy than audio markers. Patients’ clinical presentation (n = 34), response to intervention (n = 11), intervention monitoring (n = 20), providers’ characteristics (n = 12), relational dynamics (n = 14), and data preparation (n = 4) were commonly investigated clinical categories. Limitations of reviewed studies included lack of linguistic diversity, limited reproducibility, and population bias. A research framework is developed and validated (NLPxMHI) to assist computational and clinical researchers in addressing the remaining gaps in applying NLP to MHI, with the goal of improving clinical utility, data access, and fairness.
Full Text Available
LabelAId: Just-in-time AI Interventions for Improving Human Labeling Quality and Domain Knowledge in Crowdsourcing Systems

https://doi.org/10.1145/3613904.3642089

Li, Chu; Zhang, Zhihan; Saugstad, Michael; Safranchik, Esteban; Kulkarni, Chaitanyashareef; Huang, Xiaoyu; Patel, Shwetak; Iyer, Vikram; Althoff, Tim; Froehlich, Jon E (May 2024, ACM)

Full Text Available
Understanding and Supporting Debugging Workflows in Multiverse Analysis

https://doi.org/10.1145/3544548.3581099

Gu, Ken; Jun, Eunice; Althoff, Tim (April 2023, CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems)

Multiverse analysis—a paradigm for statistical analysis that considers all combinations of reasonable analysis choices in parallel—promises to improve transparency and reproducibility. Although recent tools help analysts specify multiverse analyses, they remain difficult to use in practice. In this work, we identify debugging as a key barrier due to the latency from running analyses to detecting bugs and the scale of metadata processing needed to diagnose a bug. To address these challenges, we prototype a command-line interface tool, Multiverse Debugger, which helps diagnose bugs in the multiverse and propagate fixes. In a qualitative lab study (n=13), we use Multiverse Debugger as a probe to develop a model of debugging workflows and identify specific challenges, including difficulty in understanding the multiverse’s composition. We conclude with design implications for future multiverse analysis authoring systems.
Full Text Available
Large-scale diet tracking data reveal disparate associations between food environment and diet

https://doi.org/10.1038/s41467-021-27522-y

Althoff, Tim; Nilforoshan, Hamed; Hua, Jenna; Leskovec, Jure (December 2022, Nature Communications)

An unhealthy diet is a major risk factor for chronic diseases including cardiovascular disease, type 2 diabetes, and cancer. Limited access to healthy food options may contribute to unhealthy diets. Studying diets is challenging, typically restricted to small sample sizes, single locations, and non-uniform design across studies, and has led to mixed results on the impact of the food environment. Here we leverage smartphones to track diet health, operationalized through the self-reported consumption of fresh fruits and vegetables, fast food and soda, as well as body-mass index status in a country-wide observational study of 1,164,926 U.S. participants (MyFitnessPal app users) and 2.3 billion food entries to study the independent contributions of fast food and grocery store access, income and education to diet health outcomes. This study constitutes the largest nationwide study examining the relationship between the food environment and diet to date. We find that higher access to grocery stores, lower access to fast food, higher income and college education are independently associated with higher consumption of fresh fruits and vegetables, lower consumption of fast food and soda, and lower likelihood of being affected by overweight and obesity. However, these associations vary significantly across zip codes with predominantly Black, Hispanic or white populations. For instance, high grocery store access has a significantly larger association with higher fruit and vegetable consumption in zip codes with predominantly Hispanic populations (7.4% difference) and Black populations (10.2% difference) in contrast to zip codes with predominantly white populations (1.7% difference). Policy targeted at improving food access, income and education may increase healthy eating, but intervention allocation may need to be optimized for specific subpopulations and locations.
Full Text Available
